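Clustering our observations by marine ecoregion, we can compute a cluster-robust covariance matrix estimator analogous to the adjusted White heteroskedasticity-robust estimator: \[\mathbf{\widehat{V}_{\widehat\beta_{cluster}}} = a_n(\mathbf{X'X})^{-1}\widehat\Omega_n (\mathbf{X'X})^{-1}\] where \[\widehat\Omega_n = \sum_{g=1}^G \mathbf{X_g'\hat e_g \hat e_g' X_g}\] and \[a_n = \left(\frac{n-1}{n-k}\right)\left(\frac{G}{G-1}\right)\] Here \(n\) is the number of observations, \(k\) the number of regressors, \(G\) the number of clusters (marine ecoregions), and \(\hat e_g\) the vector of least-squares residuals for cluster \(g\).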
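As a rough illustration of the formulas above, the following NumPy sketch computes the adjusted cluster-robust covariance matrix. The function name `cluster_robust_cov` and its arguments (`X`, `y`, `groups`) are hypothetical and not part of the original analysis; `groups` is assumed to hold one cluster label (e.g. a marine ecoregion code) per observation.

```python
import numpy as np

def cluster_robust_cov(X, y, groups):
    """Sketch of the adjusted cluster-robust covariance estimator.

    X: (n, k) design matrix; y: (n,) response;
    groups: (n,) cluster labels (hypothetical input, e.g. ecoregion codes).
    Returns the OLS coefficients and the (k, k) covariance estimate.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    n, k = X.shape
    labels = np.unique(groups)
    G = labels.size

    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Omega_n = sum over clusters of (X_g' e_g)(X_g' e_g)'
    omega = np.zeros((k, k))
    for g in labels:
        mask = groups == g
        score = X[mask].T @ resid[mask]  # k-vector X_g' e_g
        omega += np.outer(score, score)

    # Finite-sample adjustment a_n = ((n-1)/(n-k)) * (G/(G-1))
    a_n = ((n - 1) / (n - k)) * (G / (G - 1))
    return beta, a_n * XtX_inv @ omega @ XtX_inv
```

When every observation forms its own cluster, \(G = n\) and the result reduces to \(n/(n-k)\) times the unadjusted White estimator, which is a convenient sanity check.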
We can also calculate an alternative cluster-robust covariance matrix estimator based on cluster-level prediction errors from a leave-one-cluster-out procedure, as detailed in Hansen: \[\mathbf{\widetilde{V}_{\widehat\beta_{cluster}}} = (\mathbf{X'X})^{-1}\left(\sum_{g=1}^G \mathbf{X'}_g \widetilde e_g \widetilde e_g' \mathbf{X}_g \right) (\mathbf{X'X})^{-1}\]
where
\[\widetilde e_g = \mathbf{y}_g - \mathbf{X}_g \widehat \beta_{-g}\]
and \(\widehat \beta_{-g}\) is the least-squares estimate computed with cluster \(g\) excluded.
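The leave-one-cluster-out estimator can be sketched in the same way. This is a minimal illustration, not the original analysis code: the function name `leave_one_cluster_out_cov` is hypothetical, and the sketch assumes that the design matrix stays full rank after any single cluster is removed.

```python
import numpy as np

def leave_one_cluster_out_cov(X, y, groups):
    """Sketch of the covariance estimator built from cluster-level prediction errors.

    Assumes X'X - X_g'X_g is invertible for every cluster g.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    k = X.shape[1]

    XtX = X.T @ X
    Xty = X.T @ y
    XtX_inv = np.linalg.inv(XtX)

    omega = np.zeros((k, k))
    for g in np.unique(groups):
        mask = groups == g
        Xg, yg = X[mask], y[mask]
        # OLS estimate with cluster g excluded: beta_{-g}
        beta_minus_g = np.linalg.solve(XtX - Xg.T @ Xg, Xty - Xg.T @ yg)
        # Prediction error for cluster g
        e_tilde = yg - Xg @ beta_minus_g
        score = Xg.T @ e_tilde
        omega += np.outer(score, score)

    return XtX_inv @ omega @ XtX_inv
```

Subtracting each cluster's contribution from the full cross-products avoids rebuilding the design matrix for every refit; an explicit refit on the remaining rows gives the same coefficients.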
We calculate standard errors as the square roots of the diagonal elements of the covariance matrix estimators, here based on a simple marine ecoregion clustering:
\[s(\widehat\beta_{j}) = \sqrt{\mathbf{\widehat{V}_{\widehat\beta_j}}} = \sqrt{[\mathbf{\widehat{V}_{\widehat\beta}}]_{jj}}\]
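In code, this step is a one-liner once a covariance matrix estimate is available. The helper name `standard_errors` below is hypothetical, and the matrix in the usage note is made up purely for illustration.

```python
import numpy as np

def standard_errors(V):
    """Return the square roots of the diagonal of a covariance matrix estimate."""
    V = np.asarray(V, dtype=float)
    return np.sqrt(np.diag(V))
```

For example, a covariance matrix with diagonal entries 4 and 9 yields standard errors of 2 and 3, regardless of the off-diagonal terms.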
Next, we calculate standard errors based on adjusted clusters; the adjustment should aggregate ecoregions that are too small and divide ecoregions that are too large. For these, let’s focus on the entire range, not just the coastal range; therefore, let’s use the Longhurst provinces as the clustering input.
Not filtered to n_spp >= 5…